Dynamic Discretization of Continuous Attributes

Authors

  • João Gama
  • Luís Torgo
  • Carlos Soares
Abstract

Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which greatly increase learning times. This paper presents a new method of discretization whose main characteristic is that it takes into account interdependencies between attributes. Detecting interdependencies can be seen as discovering redundant attributes. This means that our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and a naive Bayes classifier, shows a consistent reduction in the number of features without loss of generalization accuracy.
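
For readers unfamiliar with the basic operation, the following is a minimal sketch of what discretizing a single continuous attribute means. It uses plain equal-width binning, which is not the method proposed in the paper (the abstract does not describe that algorithm and, unlike this sketch, it considers interdependencies between attributes). The function names and the `ages` sample are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def equal_width_cut_points(values, n_bins):
    """Return the n_bins - 1 interior cut points that split the
    observed range of `values` into intervals of equal width.
    (Illustrative baseline only, not the paper's method.)"""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(values, cut_points):
    """Map each continuous value to the index of the interval it falls in."""
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical continuous attribute
ages = [18, 22, 25, 31, 40, 47, 58, 63]
cuts = equal_width_cut_points(ages, 3)
labels = discretize(ages, cuts)
print(cuts)    # [33.0, 48.0]
print(labels)  # [0, 0, 0, 0, 1, 1, 2, 2]
```

Once an attribute is mapped to a small set of interval labels like these, a learner can treat it as a discrete attribute and no longer needs to sort its values during training.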

Similar resources

Hierarchical Discretization of Continuous Attributes Using Dynamic Programming

The area of knowledge discovery and data mining is growing rapidly. A large number of methods are employed to mine knowledge. Several of the methods rely on discrete data. However, most datasets used in real applications have attributes with continuous values. To make the data mining techniques useful for such datasets, discretization is performed as a pre-processing step. Discretization is a pr...

A dynamic-programming algorithm for hierarchical discretization of continuous attributes

Discretization techniques can be used to reduce the number of values for a given continuous attribute, and a concept hierarchy can be used to define a discretization of a given continuous attribute. Traditional methods of building a concept hierarchy from a continuous attribute are usually based on the level-wise approach. Unfortunately, this approach suffers from three weaknesses: (1) it only ...

Global discretization of continuous attributes as preprocessing for machine learning

Real-life data usually are presented in databases by real numbers. On the other hand, most inductive learning methods require a small number of attribute values. Thus it is necessary to convert input data sets with continuous attributes into input data sets with discrete attributes. Methods of discretization restricted to single continuous attributes will be called local, while methods that sim...

Discretization of Continuous-valued Attributes and Instance-based Learning

Recent work on discretization of continuous-valued attributes in learning decision trees has produced some positive results. This paper adopts the idea of discretization of continuous-valued attributes and applies it to instance-based learning (Aha, 1990; Aha, Kibler & Albert, 1991). Our experiments have shown that instance-based learning (IBL) usually performs well in continuous-valued attribu...

Compression-Based Discretization of Continuous Attributes

Discretization of continuous attributes into ordered discrete attributes can be beneficial even for propositional induction algorithms that are capable of handling continuous attributes directly. Benefits include possibly large improvements in induction time, smaller sizes of induced trees or rule sets, and even improved predictive accuracy. We define a global evaluation measure for discretization...

Publication date: 1998